Title: Haskell Magic: Coercible, ala, Monoids and all that
Date: 2018-10-08
Modified: 2018-10-08
Category: Blog
Slug: haskell-magic-coercible-ala
Tags: haskell-magic, haskell
Summary: Where I describe a penny-drop moment that too me way too long

So, I recently [gushed about ``Foldable``][1], thanks to a significant
penny-drop moment in my own Haskelling. Today, I figured I'd share one more,
which leads on from the ideas related to ``Foldable`` and monoids I canvassed
previously, but adds on a few helpful and clever ideas. Let's get to it.

## Newtypes and ``Coercible``

In Haskell, the '``newtype`` wrapper' is a common idea; something like:

    :::haskell
    newtype Foo a = Foo { unFoo :: Bar a }

While most ``newtype``s are much more meaningful than that, the idea is common
for two reasons:

  1. They allow us to define a different API for ``Foo`` than the underlying
     ``Bar``, ensuring that inappropriate uses are made impossible; and
  2. For types which can be (lawful) members of type classes in more than one
     way, we can tell the compiler which one we want.

``foldMap`` in particular benefits from the latter use of ``newtype``, as it
allows us to vary the behaviour of ``Foldable``s full of ``a``s by wrapping
those ``a``s in a different monoid. As many types can be monoids in many ways
(even ``Bool`` has two ways it can be a monoid), this can give us a wide range
of behaviours, as we've seen. However, this isn't the only reason to use
``newtype``s this way: another good example are the [two different instances of
``Applicative`` for lists][2].

When we're using ``newtype``s in this way, we're essentially telling the
compiler something like 'I want you to do this function in this way, but in
reality, these two things are the same'. Thus, in an ideal world, once we've
figured out what functions to call to make our type behave the way we want, the
compiler can forget all about the ``newtype``s and just work directly with the
'underlying' type. While in theory, this is exactly what the compiler does, in
practice, [this can be confounded a bit][3], even for simple cases (for
example, lists of ``newtype``s). This forces us to have a runtime cost for
operations that _really_ ought to be 'free', which is undesirable. Since GHC
7.8.1, however, we don't need to - thanks to ``Data.Coerce.Coercible``. 

So what's the deal here? Essentially, ``Data.Coerce`` provides a special
``Coercible`` type class, whose instances _cannot_ be defined by the user, and
which doesn't persist beyond compile time, for any types which are represented
the same way. [The exact rules for this are quite specific][4], but for our
purposes, we need only really remember that, whenever we have:

    :::haskell
    newtype Foo = Foo { unFoo :: Bar }

Then GHC generates instances equivalent to:

    :::haskell
    instance Coercible Foo Bar
    -- plus an appropriate coerce :: Foo -> Bar
    instance Coercible Bar Foo
    -- plus an appropriate coerce :: Bar -> Foo

Again, these instances don't _really_ exist, and ``coerce`` is merely an
instruction to treat the same representation differently for the purposes of
dispatch (deciding what functions get called) - none of this translates to
runtime at all. We can do some fancier things with this, but for our purposes,
this will do.

## Directing ``coerce`` with ``TypeApplications``

Unfortunately, ``coerce`` is a _highly_ polymorphic function - its type is
effectively:

    :::haskell
    coerce :: (Coercible a b) => a -> b

This can sometimes throw GHC for a loop when writing polymorphic functions -
even if we know exactly what we actually want the coercion to be. To
give the compiler a helping hand, we would usually provide a type signature or
an argument, but this can be awkward. [``TypeApplications``][5] aims to resolve this
problem by giving us some syntax to allow us to state things more directly.

To demonstrate, let's consider a GHCi session. We'll set ourselves up as so:

    :::haskell
    import Data.Coerce
    :set -fprint-explicit-foralls
    :set -XTypeApplications

This gives us access to ``coerce``, and also enables a bit of helpful output to
allow us to understand how to use ``TypeApplications`` more easily. We also have
to enable the extension to use it. Let's see some examples of
``TypeApplications`` in action:

    :::haskell
    :t coerce
    -- coerce :: forall {a} {b} . Coercible a b => a -> b
    :t coerce @Int
    -- coerce @Int :: forall {b} . Coercible b Int => Int -> b
    :t coerce @_ @Int
    -- coerce @_ @Int :: forall {w} . Coercible w Int => w -> Int
    :t foldMap
    -- foldMap 
    --   :: forall {t :: * -> *} {m} {a}.
    --      (Foldable t, Monoid m) =>
    --      (a -> m) -> t a -> m

Let's take a closer look at that last one. Thanks to
``-fprint-explicit-foralls``, GHCi shows us exactly what type variables we have,
and their kinds (as indicated by ``{t :: * -> *}``. Normally, this wouldn't be  
helpful, but with ``TypeApplications``, it shows us _where_ we have to use
the ``@`` notation to 'fix' the type we want. For example, suppose we wanted to
know what type ``foldMap`` would have if our ``Monoid`` was ``Sum Int``. As ``m`` 
is the _second_ argument listed, we would need to write this:

    :::haskell
    import Data.Monoid
    :t foldMap @_ @(Sum Int)
    -- foldMap @_ @(Sum Int)
    --   :: forall {w :: * -> *} {a} .
    --      Foldable w =>
    --      (a -> Sum Int) -> w a -> Sum Int

Any 'trailing' type variables can be left alone, but any 'leading' ones we don't
care about need to be 'filled in' with ``@_``, which says 'I don't care what
this is'. We also need parentheses if our type is more complex than a single
word (as in the case above). Thus, we can use ``TypeApplications`` to 'fix' the
excessive polymorphism of functions like ``coerce`` that confuse GHC, as we
showed above.

## Making ``Coercible`` work for us

One of the downsides of using ``foldMap`` is the constant need to do something
like this:

    :::haskell
    count :: (Foldable t) => (a -> Bool) -> t a -> Int
    count p = getSum . foldMap (Sum . fromEnum . p)

What we really want to be able to do is say something like:

    :::haskell
    count p = foldMap (fromEnum . p)

However, this won't work, as our compiler can't possibly read our minds and
figure out _which_ monoid definition of ``Bool`` we want. We can reduce some of
this boilerplate using the [``coercible-utils`` package][6], which provides
this wonder-weapon of a function:

    :::haskell
    ala' :: (Coercible a b, Coercible a' b') => (a -> b) -> ((d -> b) -> c ->
    b') -> (d -> a) -> c -> a'

When I first saw this function, I was a bit taken aback - it was difficult to
see what on earth it did! However, it's really not all that scary.
While its uses are many and varied, for our purposes, it's sufficient to see it
as a way of abstracting away the wrapping and unwrapping of ``Monoid``
``newtype``s used with functions like ``foldMap``. Essentially, ``ala'`` wants
three arguments:

  1. A 'witness' of the ``Coercible a b`` (which is never actually used for
     anything);
  2. A 'higher-order function' which we want to operate with; and
  3. A 'preprocessing function' to massage our input into a more appropriate
     form.

Let's see how we can derive ``count`` using ``ala'``, with the help of GHCi once
again. Our 'higher-order function' in this case is ``foldMap``:

    :::haskell
    :t ala'
    -- ala'
    --  :: forall {a} {b} {a'} {b'} {d} {c}.
    --     (Coercible a b, Coercible a' b') =>
    --     (a -> b) -> ((d -> b) -> c -> b') -> (d -> a) -> c -> a'
    :t \witness preprocessor -> ala' witness foldMap preprocessor
    -- \witness preprocessor -> ala' witness foldMap preprocessor
    --   :: forall {t :: * -> *} {b'} {a} {a'} {d}.
    --      (Foldable t, Monoid b', Coercible a b', Coercible a' b') =>
    --      (a -> b') -> (d -> a) -> t d -> a' 

This is starting to get us somewhere. We also know what we want in the end (an
``Int``), so let's use ``TypeApplications`` to help. This is a bit wordy in
GHCi, but isn't an issue in practice, as we usually give type signatures to
functions in source code:

    :::haskell
    let partialCount :: (Foldable t, Monoid w, Coercible a w, Coercible a' w) => (a -> w) -> (d -> a) -> t d -> a'; partialCount witness preprocessor = ala' witness foldMap preprocessor
    :t partialCount
    -- partialCount
    --  :: forall {t :: * -> *} {w} {a} {a'} {d}.
    --     (Foldable t, Monoid w, Coercible a w, Coercible a' w) =>
    --     (a -> w) -> (d -> a) -> t d -> a'
    :t partialCount @_ @_ @_ @Int @_
    -- I rewrote this one with some renaming to avoid confusion - your GHCi may
    -- output something else.
    -- partialCount @_ @_ @_ @Int @_
    --  :: forall {t :: * -> *} {m} {w3} {a}.
    --     (Foldable t, Monoid m, Coercible w3 m, Coercible m Int) =>
    --     (w3 -> m) -> (a -> w3) -> t a -> Int

Based on the signature we just saw, we can see that our ``Monoid`` instance must
be coercible to ``Int``. We happened to use ``Sum Int`` in our previous
definition of ``count``, so we have a witness readily available: ``Sum``. Now we
just have to convert the elements of our ``Foldable`` into ``Int``, which we can
do using ``fromEnum . p``. Putting it all together, we get:

    :::haskell
    :t \p -> ala' Sum foldMap (fromEnum . p)
    -- \p -> ala' Sum foldMap (fromEnum . p)
    --   :: forall {t :: * -> *} {a1} {a'} {a2}.
    --      (Foldable t, Enum a1, Coercible a' Int) =>
    --      (a2 -> a1) -> t a2 -> a'

Now, this is actually _more_ general than ``count`` as we defined it before, but
it works just fine. If we substitute ``Bool`` for ``a1`` and ``Int`` for ``a'``,
we recover ``count`` just as we had it. Thus, we can write it as

    :::haskell
    count p = ala' Sum foldMap (fromEnum . p)

No wrapping or unwrapping necessary.

``coercible-utils`` also provides several helper functions. In many cases,
``ala'`` is too general, as we don't really need a 'preprocessor'. Consider, for
example, possible default of ``and`` and ``or``:

    :::haskell
    and :: (Foldable t) => t Bool -> Bool
    and = ala' All foldMap id

    or :: (Foldable t) => t Bool -> Bool
    or = ala' Any foldMap id

Having to inject that ``id`` in both cases should really not be necessary to do
manually. Hence the more restrictive cousin of ``ala'``:

    :::haskell
    ala :: (Coercible a b, Coercible a' b') => (a -> b) -> ((a -> b) -> c -> d')
    -> c -> a'
    ala f hof = ala' f hof id

So now, we can rewrite those as:

    :::haskell
    and = ala All foldMap
    or = ala Any foldMap

In fact, these definitions are more general than the ones in ``Data.Foldable``:

    :::haskell
    -- without -fprint-explicit-foralls
    :t ala Any foldMap
    -- ala Any foldMap :: (Foldable t, Coercible a' Bool) => t Bool -> a'

We don't have to give back a ``Bool`` as a result of this function -- anything
that's coercible to ``Bool`` will do. This is another advantage of using ``ala``
and ``ala'``; we can potentially 're-wrap' into _different_ ``newtype``s after
each call if we need to. This is an extra layer of generality _on top of_ what
``foldMap`` already gives us; this is why I've been using ``foldMap`` everywhere
recently, along with ``coercible-utils``.

## Rounding off

Thus ends my description of my penny-drop moment about ``Coercible``,
``TypeApplications`` and ``ala'``. Each of these seem tailor-made for the
others, and thanks to ``coercible-utils``, can be used to do some amazing
things. I encourage everyone to check out ``coercible-utils`` - if nothing else,
you'll find some really good tricks for working with newtypes in it, and likely
might be able to solve some of your problems much more concisely. Think hard,
and have fun.


[1]: https://retro-freedom.nz/haskell-magic-from-semigroup-to-foldable.html
[2]: https://wiki.haskell.org/Typeclassopedia#Instances_2
[3]: https://wiki.haskell.org/GHC/Coercible#The_problem
[4]: https://ghc.haskell.org/trac/ghc/wiki/Roles#Coercible
[5]: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#extension-TypeApplications
[6]: https://hackage.haskell.org/package/coercible-utils-0.0.0
